Grid File

What is a Grid File?

A Grid File is a spatial indexing method that divides a space (like a 2D plane) into a grid of rectangular cells, each pointing to a "bucket" that stores the data points located in that cell. Imagine overlaying a grid on a map: each square in the grid stores whatever falls into that area. It's especially useful when data is coming in continuously or in bulk and needs to be efficiently stored and searched. -Where It’s Used Databases for spatial indexing (especially in 2D) Map-based applications for range lookups (e.g., "show all buildings in this area") Scientific simulations or image processing where space is uniformly divided -Strengths Fast range queries: easy to find everything in a specific region by checking only the relevant cells. Simple structure: easy to implement and understand. Can be dynamically expanded as more data is added. Efficient for data that’s evenly distributed in space. -Weaknesses Not ideal for very clustered or unevenly distributed data (some cells may have many points, others none). Cell splitting logic (to avoid overflowing buckets) can become complex. Can use more space compared to tree-based structures when dealing with sparse data. -Example Use If a map is split into a 10x10 grid: A point at (23, 65) goes into the appropriate cell. To search for all points in a rectangular region, only the intersecting cells are scanned—no need to search the whole dataset.

Grid File Operations

Construction:

Initialize grid directory and linear scales for each dimension
Start with single cell pointing to one data bucket

Insertion:

Locate Cell: Use linear scales to find correct grid cell
Add Entry: Insert into the target bucket
Split If Full: If bucket overflows:
- Split linear scale in the needed dimension
- Update grid directory and redistribute entries

Search:

Range Query: Identify overlapping grid cells
Exact Match: Use scales to directly locate cell and bucket

Deletion:

Remove entry from appropriate bucket
Merge or adjust scales if underflows occur (optional, rarely implemented)

Complexities

Operation	Average Case	Worst Case
Bulk Loading	O(n)	O(n²)
Insertion/Deletion	O(1)	O(1)
Range Query	O(m)	O(n)
k-NN Query	O(k.m)	O(k.n)

Insertion:

To insert a point, the corresponding grid cell is computed using its coordinates. The point is then added to the bucket associated with that cell. This direct access makes insertion constant time: O(1), assuming no overflow handling is needed.

Deletion:

Deletion also works in constant time, O(1), by locating the cell where the point lies and removing it from the cell’s bucket. Efficiency depends on maintaining small bucket sizes and a regular grid structure.

k-NN Search:

The k-nearest neighbor search begins at the central cell and expands outward to neighboring cells. In each step, the algorithm checks points in new cells, maintaining the k closest so far. The time complexity is O(k·m), where m is the number of scanned cells.

Range Query:

For a range query, we first find all grid cells that intersect with the query rectangle. Then, we scan the points in those cells and return the ones that fall within the range. The complexity is O(m), where m is the number of intersecting cells.

Grid File:

What is a Grid File?

Grid File Operations

Construction:

Insertion:

Search:

Deletion:

Complexities

Insertion:

Deletion:

k-NN Search:

Range Query: